PCT 



WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 6 : 

C07K 14/29, C12N 15/86, A61K 31/70 



A1



(11) International Publication Number: WO 98/16554 

(43) International Publication Date: 23 April 1998 (23.04.98) 



(21) International Application Number: PCT/US97/19044

(22) International Filing Date: 17 October 1997 (17.10.97) 



(30) Priority Data:
08/733,230          17 October 1996 (17.10.96)          US



(71) Applicant: UNIVERSITY OF FLORIDA [US/US]; 223 Grinter
Hall, Gainesville, FL 32611 (US).

(72) Inventors: BARBET, Anthony, F.; 8803 S.W. 138th Street,
Archer, FL 32618 (US). GANTA, Roman, Reddy; 6519
West Newberry Road #802, Gainesville, FL 32605 (US).
MCGUIRE, Travis, C.; S.W. 920 Crestview, Pullman, WA
99163 (US). BURRIDGE, Michael, J.; 10021 S.W. 67th
Drive, Gainesville, FL 32608 (US). NYIKA, Aceme; House
6282, Unit J, Chitungwiza, Harare (ZW). RURANGIRWA,
Fred, R.; 2065 N.W. Turner Drive, Pullman, WA 99163
(US). MAHAN, Suman, M.; 71 Olwyn Avenue, Strathaven,
Harare (ZW).

(74) Agents: SALIWANCHIK, David, R. et al.; Saliwanchik,
Lloyd & Saliwanchik, Suite A-1, 2421 N.W. 41st Street,
Gainesville, FL 32606-6669 (US).



(81) Designated States: AL, AU, BA, BB, BG, BR, CA, CN, CU,
CZ, EE, GE, HU, ID, IL, IS, JP, KP, KR, LC, LK, LR, LT,
LV, MG, MK, MN, MX, NO, NZ, PL, RO, SD, SG, SI, SK,
SL, TR, TT, UA, UZ, VN, YU, ZW, ARIPO patent (GH,
KE, LS, MW, SD, SZ, UG, ZW), Eurasian patent (AM, AZ,
BY, KG, KZ, MD, RU, TJ, TM), European patent (AT, BE,
CH, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL,
PT, SE), OAPI patent (BF, BJ, CF, CG, CI, CM, GA, GN,
ML, MR, NE, SN, TD, TG).



Published 

With international search report. 
Before the expiration of the time limit for amending the 
claims and to be republished in the event of the receipt of 
amendments. 



(54) Title: NUCLEIC ACID VACCINES AGAINST RICKETTSIAL DISEASES AND METHODS OF USE 
(57) Abstract 



Described are nucleic acid vaccines containing genes to protect animals or humans against rickettsial
diseases. Also described are polypeptides and methods of using these polypeptides to detect antibodies
to pathogens.






FOR THE PURPOSES OF INFORMATION ONLY 



Codes used to identify States party to the PCT on the front pages of pamphlets publishing international applications under the PCT. 



AL  Albania
AM  Armenia
AT  Austria
AU  Australia
AZ  Azerbaijan
BA  Bosnia and Herzegovina
BB  Barbados
BE  Belgium
BF  Burkina Faso
BG  Bulgaria
BJ  Benin
BR  Brazil
BY  Belarus
CA  Canada
CF  Central African Republic
CG  Congo
CH  Switzerland
CI  Côte d'Ivoire
CM  Cameroon
CN  China
CU  Cuba
CZ  Czech Republic
DE  Germany
DK  Denmark
EE  Estonia
ES  Spain
FI  Finland
FR  France
GA  Gabon
GB  United Kingdom
GE  Georgia
GH  Ghana
GN  Guinea
GR  Greece
HU  Hungary
IE  Ireland
IL  Israel
IS  Iceland
IT  Italy
JP  Japan
KE  Kenya
KG  Kyrgyzstan
KP  Democratic People's Republic of Korea
KR  Republic of Korea
KZ  Kazakhstan
LC  Saint Lucia
LI  Liechtenstein
LK  Sri Lanka
LR  Liberia
LS  Lesotho
LT  Lithuania
LU  Luxembourg
LV  Latvia
MC  Monaco
MD  Republic of Moldova
MG  Madagascar
MK  The former Yugoslav Republic of Macedonia
ML  Mali
MN  Mongolia
MR  Mauritania
MW  Malawi
MX  Mexico
NE  Niger
NL  Netherlands
NO  Norway
NZ  New Zealand
PL  Poland
PT  Portugal
RO  Romania
RU  Russian Federation
SD  Sudan
SE  Sweden
SG  Singapore
SI  Slovenia
SK  Slovakia
SN  Senegal
SZ  Swaziland
TD  Chad
TG  Togo
TJ  Tajikistan
TM  Turkmenistan
TR  Turkey
TT  Trinidad and Tobago
UA  Ukraine
UG  Uganda
US  United States of America
UZ  Uzbekistan
VN  Viet Nam
YU  Yugoslavia
ZW  Zimbabwe



WO 98/16554 



PCT/US97/19044 



DESCRIPTION 

NUCLEIC ACID VACCINES AGAINST
RICKETTSIAL DISEASES AND METHODS OF USE

This invention was made with government support under USAID Grant No.
LAG-1328-G-00-3030-00. The government has certain rights in this invention.

Cross-Reference to a Related Application
This is a continuation-in-part of U.S. patent application Serial No. 08/733,230, filed 
October 17, 1996. 

Technical Field 

This invention relates to nucleic acid vaccines for rickettsial diseases of animals, 
including humans. 

Background of the Invention 

The rickettsias are a group of small bacteria commonly transmitted by arthropod vectors 
to man and animals, in which they may cause serious disease. The pathogens causing human 
rickettsial diseases include the agent of epidemic typhus, Rickettsia prowazekii, which has 
resulted in the deaths of millions of people during wartime and natural disasters. The causative 
agents of spotted fever, e.g., Rickettsia rickettsii and Rickettsia conorii, are also included within 
this group. Recently, new types of human rickettsial disease caused by members of the tribe 
Ehrlichiae have been described. Ehrlichiae infect leukocytes and endothelial cells of many 
different mammalian species, some of them causing serious human and veterinary diseases. 
Over 400 cases of human ehrlichiosis, including some fatalities, caused by Ehrlichia chaffeensis 
have now been reported. Clinical signs of human ehrlichiosis are similar to those of Rocky 
Mountain spotted fever, including fever, nausea, vomiting, headache, and rash. 

Heartwater is another infectious disease caused by a rickettsial pathogen, namely 
Cowdria ruminantium, and is transmitted by ticks of the genus Amblyomma. The disease occurs
throughout most of Africa and has an estimated endemic area of about 5 million square miles. 
In endemic areas, heartwater is a latent infection in indigenous breeds of cattle that have been 
subjected to centuries of natural selection. The problems occur where the disease contacts 
susceptible or naive cattle and other ruminants. Heartwater has been confirmed to be on the
island of Guadeloupe in the Caribbean and is spreading through the Caribbean Islands. The tick
vectors responsible for spreading this disease are already present on the American mainland and 
threaten the livestock industry in North and South America. 

In acute cases of heartwater, animals exhibit a sudden rise in temperature, signs of 
anorexia, cessation of rumination, and nervous symptoms including staggering, muscle 
twitching, and convulsions. Death usually occurs during these convulsions. Peracute cases of 
the disease occur where the animal collapses and dies in convulsions having shown no 
preliminary symptoms. Mortality is high in susceptible animals. Angora sheep infected with 
the disease have a 90% mortality rate while susceptible cattle strains have up to a 60% mortality 
rate. 

If detected early, tetracycline or chloramphenicol treatment is effective against
rickettsial infections, but symptoms are similar to those of numerous other infections and there
are no satisfactory diagnostic tests (Helmick, C., K. Bernard, L. D'Angelo [1984] J. Infect. Dis.
150:480).

Animals which have recovered from heartwater are resistant to further homologous, and 
in some cases heterologous, strain challenge. It has similarly been found that persons recovering 
from a rickettsial infection may develop a solid and lasting immunity. Individuals recovered 
from natural infections are often immune to multiple isolates and even species. For example, 
guinea pigs immunized with a recombinant R. conorii protein were partially protected even 
against R. rickettsii (Vishwanath, S., G. McDonald, N. Watkins [1990] Infect. Immun. 58:646).
It is known that there is structural variation in rickettsial antigens between different geographical 
isolates. Thus, a functional recombinant vaccine against multiple isolates would need to contain 
multiple epitopes, e.g., protective T and B cell epitopes, shared between isolates. It is believed
that serum antibodies do not play a significant role in the mechanism of immunity against
rickettsia (Uilenberg, G. [1983] Advances in Vet. Sci. and Comp. Med. 27:427-480; Du Plessis,
J.L. [1970] Onderstepoort J. Vet. Res. 37(3):147-150).

Vaccines based on inactivated or attenuated rickettsiae have been developed against 
certain rickettsial diseases, for example against R. prowazekii and R. rickettsii. However, these
vaccines have major problems or disadvantages, including undesirable toxic reactions, difficulty 
in standardization, and expense (Woodward, T. [1981] "Rickettsial diseases: certain unsettled 
problems in their historical perspective," In Rickettsia and Rickettsial Diseases, W. Burgdorfer 
and R. Anacker, eds., Academic Press, New York, pp. 17-40). 

A vaccine currently used in the control of heartwater is composed of live infected sheep 
blood. This vaccine also has several disadvantages. First, expertise is required for the
intravenous inoculation techniques required to administer this vaccine. Second, vaccinated
animals may experience shock and so require daily monitoring for a period after vaccination. 
There is a possibility of death due to shock throughout this monitoring period, and the drugs 
needed to treat any shock induced by vaccination are costly. Third, blood-borne parasites may
be present in the blood vaccine and be transmitted to the vaccinates. Finally, the blood vaccine 
requires a cold chain to preserve the vaccine. 

Clearly, a safer, more effective vaccine that is easily administered would be particularly 
advantageous. For these reasons, and with the advent of new methods in biotechnology, 
investigators have concentrated recently on the development of new types of vaccines, including 
recombinant vaccines. However, recombinant vaccine antigens must be carefully selected and 
presented to the immune system such that shared epitopes are recognized. These factors have 
contributed to the search for effective vaccines. 

A protective vaccine against rickettsiae that elicits a complete immune response can be 
advantageous. A few antigens which potentially can be useful as vaccines have now been 
identified and sequenced for various pathogenic rickettsia. The genes encoding the antigens and 
that can be employed to recombinantly produce those antigens have also been identified and
sequenced. Certain protective antigens identified for R. rickettsii, R. conorii, and R. prowazekii
(e.g., rOmpA and rOmpB) are large (>100 kDa), dependent on retention of native conformation 
for protective efficacy, but are often degraded when produced in recombinant systems. This 
presents technical and quality-control problems if purified recombinant proteins are to be 
included in a vaccine. The mode of presentation of a recombinant antigen to the immune system 
can also be an important factor in the immune response. 

Nucleic acid vaccination has been shown to induce protective immune responses in non- 
viral systems and in diverse animal species (Special Conference Issue, WHO meeting on nucleic 
acid vaccines [1994] Vaccine 12:1491). Nucleic acid vaccination has induced cytotoxic 
lymphocyte (CTL), T-helper 1, and antibody responses, and has been shown to be protective 
against disease (Ulmer, J., J. Donnelly, S. Parker et al. [1993] Science 259:1745). For example,
direct intramuscular injection of mice with DNA encoding the influenza nucleoprotein caused 
the production of high titer antibodies, nucleoprotein-specific CTLs, and protection against viral 
challenge. Immunization of mice with plasmid DNA encoding the Plasmodium yoelii
circumsporozoite protein induced high antibody titers against malaria sporozoites and CTLs, and
protection against challenge infection (Sedegah, M., R. Hedstrom, P. Hobart, S. Hoffman [1994]
Proc. Natl. Acad. Sci. USA 91:9866). Cattle immunized with plasmids encoding bovine
herpesvirus 1 (BHV-1) glycoprotein IV developed neutralizing antibody and were partially
protected (Cox, G., T. Zamb, L. Babiuk [1993] J. Virol. 67:5664). However, it has been a
question in the field of immunization whether the recently discovered technology of nucleic acid 
vaccines can provide improved protection against an antigenic drift variant. Moreover, it has 
not heretofore been recognized or suggested that nucleic acid vaccines may be successful to 
protect against rickettsial disease or that a major surface protein conserved in rickettsia was
protective against disease. 



Brief Summary of the Invention 

Disclosed and claimed here are novel vaccines for conferring immunity to rickettsia 
infection, including Cowdria ruminantium causing heartwater. Also disclosed are novel nucleic 
acid compositions and methods of using those compositions, including to confer immunity in 
a susceptible host. Also disclosed are novel materials and methods for diagnosing infections by
Ehrlichia in humans or animals. 

One aspect of the subject invention concerns a nucleic acid, e.g., DNA or mRNA,
vaccine containing the major antigenic protein 1 gene (MAP1) or the major antigenic protein
2 gene (MAP2) of rickettsial pathogens. In one embodiment, the nucleic acid vaccines can be
driven by the human cytomegalovirus (HCMV) enhancer-promoter. In studies immunizing mice
by intramuscular injection of a DNA vaccine composition according to the subject invention,
immunized mice seroconverted and reacted with MAP1 in antigen blots. Splenocytes from
immunized mice, but not from control mice immunized with vector only, proliferated in
response to recombinant MAP1 and rickettsial antigens in in vitro lymphocyte proliferation tests.
In experiments testing different DNA vaccine dose regimens, increased survival rates as 
compared to controls were observed on challenge with rickettsia. Accordingly, the subject 
invention concerns the discovery that DNA vaccines can induce protective immunity against 
rickettsial disease or death resulting therefrom. 

Brief Description of the Drawing 

Figures 1A-1C show a comparison of the amino acid sequences from alignment of the
three rickettsial proteins, namely, those of Cowdria ruminantium (C.r.), Ehrlichia chaffeensis
(E.c.), and Anaplasma marginale (A.m.).

Figures 2A-2C show the DNA sequence of the 28 kDa gene locus cloned from E.
chaffeensis (Figs. 2A-2B) and E. canis (Fig. 2C). One-letter amino acid codes for the deduced
protein sequences are presented below the nucleotide sequence. The proposed sigma-70-like
promoter sequences (38) are presented in bold and underlined text as -10 and -35 (consensus -35
and -10 sequences are TTGACA and TATAAT, respectively). Similarly, consensus ribosomal
binding sites and transcription terminator sequences (bold letter sequence) are identified. G-rich
regions identified in the E. chaffeensis sequence are underlined. The conserved sequences from
within the coding regions selected for RT-PCR assay are identified with italics and underlined
text.
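
As a rough illustration of how the -35 and -10 consensus hexamers named above (TTGACA and TATAAT) could be located in a nucleotide sequence, the following Python sketch scans for approximate matches. It is not the method used to annotate Figures 2A-2C; the function names, the mismatch tolerance, and the 15-19 base spacer window are assumptions chosen for illustration only.

```python
# Consensus hexamers quoted in the description of Figures 2A-2C.
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(seq, max_mismatch=2, spacer=(15, 19)):
    """Return (pos35, pos10, total_mismatches) for candidate hexamer pairs.

    max_mismatch and the spacer range are illustrative assumptions,
    not values taken from the application.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        m35 = mismatches(seq[i:i + 6], MINUS_35)
        if m35 > max_mismatch:
            continue
        # Look for a -10-like hexamer at each allowed spacer distance.
        for gap in range(spacer[0], spacer[1] + 1):
            j = i + 6 + gap
            if j + 6 > len(seq):
                break
            m10 = mismatches(seq[j:j + 6], MINUS_10)
            if m10 <= max_mismatch:
                hits.append((i, j, m35 + m10))
    return hits


# Synthetic test sequence (invented, not from the sequence listing).
toy = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "GG"
print(find_promoters(toy, max_mismatch=0))
```

Candidates found by such a scan would still need to be checked against the annotated sequence and the other features described for the locus.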

Figure 3A shows the complete sequence of the MAP2 homolog of Ehrlichia canis. The
arrow (→) represents the predicted start of the mature protein. The asterisk (*) represents the
stop codon. Underlined nucleotides 5' to the open reading frame with -35 and -10 below
represent predicted promoter sequences. Double underlined nucleotides represent the predicted
ribosomal binding site. Underlined nucleotides 3' to the open reading frame represent possible
transcription termination sequences.

Figure 3B shows the complete sequence of the MAP2 homolog of Ehrlichia chaffeensis.
The arrow (→) represents the predicted start of the mature protein. The asterisk (*) represents
the stop codon. Underlined nucleotides 5' to the open reading frame with -35 and -10 below
represent predicted promoter sequences. Double underlined nucleotides represent the predicted
ribosomal binding site. Underlined nucleotides 3' to the open reading frame represent possible
transcription termination sequences.



Brief Description of the Sequences 
SEQ ID NO. 1 is the coding sequence of the MAP1 gene from Cowdria ruminantium
(Highway isolate).

SEQ ID NO. 2 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 1.

SEQ ID NO. 3 is the coding sequence of the MAP1 gene from Ehrlichia chaffeensis.

SEQ ID NO. 4 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 3.

SEQ ID NO. 5 is the Anaplasma marginale MSP4 gene coding sequence.

SEQ ID NO. 6 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 5.

SEQ ID NO. 7 is a partial coding sequence of the VSA1 gene from Ehrlichia
chaffeensis, also shown in Figures 2A-2B.

SEQ ID NO. 8 is the coding sequence of the VSA2 gene from Ehrlichia chaffeensis,
also shown in Figures 2A-2B.

SEQ ID NO. 9 is the coding sequence of the VSA3 gene from Ehrlichia chaffeensis,
also shown in Figures 2A-2B.

SEQ ID NO. 10 is the coding sequence of the VSA4 gene from Ehrlichia chaffeensis,
also shown in Figures 2A-2B.

SEQ ID NO. 11 is a partial coding sequence of the VSA5 gene from Ehrlichia
chaffeensis, also shown in Figures 2A-2B.

SEQ ID NO. 12 is the coding sequence of the VSA1 gene from Ehrlichia canis, also
shown in Figure 2C.

SEQ ID NO. 13 is a partial coding sequence of the VSA2 gene from Ehrlichia canis,
also shown in Figure 2C.

SEQ ID NO. 14 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 7,
also shown in Figures 2A-2B.

SEQ ID NO. 15 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 8,
also shown in Figures 2A-2B.

SEQ ID NO. 16 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 9,
also shown in Figures 2A-2B.

SEQ ID NO. 17 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 10,
also shown in Figures 2A-2B.

SEQ ID NO. 18 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 11,
also shown in Figures 2A-2B.

SEQ ID NO. 19 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 12,
also shown in Figure 2C.

SEQ ID NO. 20 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 13,
also shown in Figure 2C.

SEQ ID NO. 21 is the coding sequence of the MAP2 gene from Ehrlichia canis, also
shown in Figure 3A.

SEQ ID NO. 22 is the coding sequence of the MAP2 gene from Ehrlichia chaffeensis,
also shown in Figure 3B.

SEQ ID NO. 23 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 21,
also shown in Figure 3A.

SEQ ID NO. 24 is the polypeptide encoded by the polynucleotide of SEQ ID NO. 22,
also shown in Figure 3B.

Detailed Disclosure of the Invention

In one embodiment, the subject invention concerns a novel strategy, termed nucleic acid
vaccination, for eliciting an immune response protective against rickettsial disease. The subject 
invention also concerns novel compositions that can be employed according to this novel 
strategy for eliciting a protective immune response. According to the subject invention,
recombinant plasmid DNA or mRNA encoding an antigen of interest is inoculated directly into
the human or animal host where the antigen is expressed and an immune response induced. 
Advantageously, problems of protein purification, as can be encountered with antigen delivery 
using live vectors, can be virtually eliminated by employing the compositions or methods 
according to the subject invention. Unlike live vector delivery, the subject invention can provide 
a further advantage in that the DNA or RNA does not replicate in the host, but remains episomal 
with gene expression directed for as long as 19 months or more post-injection. See, for example, 
Wolff, J.A., J.J. Ludtke, G. Acsadi, P. Williams, A. Jani (1992) Hum. Mol. Genet. 1:363. A
complete immune response can be obtained as recombinant antigen is synthesized intracellularly
and presented to the host immune system in the context of autologous class I and class II MHC
molecules. 

In one embodiment, the subject invention concerns nucleic acids and compositions 
comprising those nucleic acids that can be effective in protecting an animal from disease or 
death caused by rickettsia. For example, a nucleic acid vaccine of the subject invention has been 
shown to be protective against Cowdria ruminantium, the causative agent of heartwater in 
domestic ruminants. Accordingly, DNA sequences of rickettsial genes, e.g., MAP1 or
homologues thereof, can be used as nucleic acid vaccines against human and animal rickettsial
diseases. The MAP1 gene used to obtain this protection is also present in other rickettsiae
including Anaplasma marginale, Ehrlichia canis, and in a causative agent of human ehrlichiosis,
Ehrlichia chaffeensis (van Vliet, A., F. Jongejan, M. van Kleef, B. van der Zeijst [1994] Infect.
Immun. 62:1451). The MAP1 gene or a MAP1-like gene can also be found in certain Rickettsia
spp. MAP1-like genes from Ehrlichia chaffeensis and Ehrlichia canis have now been cloned
and sequenced. These MAP-1 homologs are also referred to herein as Variable Surface Antigen
(VSA) genes.

The present invention also concerns polynucleotides encoding MAP2 or MAP2 
homologs from Ehrlichia canis and Ehrlichia chaffeensis. MAP2 polynucleotide sequences of 
the invention can be used as vaccine compositions and in diagnostic assays. The polynucleotides 
can also be used to produce the MAP2 polypeptides encoded thereby. 

Compositions comprising the subject polynucleotides can include appropriate nucleic 
acid vaccine vectors (plasmids), which are commercially available (e.g., Vical, San Diego, CA).
In addition, the compositions can include a pharmaceutically acceptable carrier, e.g., saline. The
pharmaceutically acceptable carriers are well known in the art and also are commercially 
available. For example, such acceptable carriers are described in E.W. Martin's Remington's 
Pharmaceutical Science, Mack Publishing Company, Easton, PA.

The subject invention also concerns polypeptides encoded by the subject
polynucleotides. Specifically exemplified are the polypeptides encoded by the MAP-1 and VSA
genes of C. ruminantium, E. chaffeensis, and E. canis and the MSP4 gene of Anaplasma marginale.
Polypeptides encoded by E. chaffeensis and E. canis MAP2 genes are also exemplified herein.

Also encompassed within the scope of the present invention are fragments and variants 
of the exemplified polynucleotides. Variants include polynucleotides and/or polypeptides 
having base or amino acid additions, deletions and substitutions in the sequence of the subject 
molecule so long as those variants have substantially the same activity or serologic reactivity
as the native molecules. Also included are allelic variants of the subject polynucleotides. The
polypeptides and peptides of the present invention can be used to raise antibodies that are
reactive with the polypeptides disclosed herein. The polypeptides and peptides can also be used 
as molecular weight markers. 

Another aspect of the subject invention concerns antibodies reactive with MAP-1 and
MAP2 polypeptides disclosed herein. Antibodies can be monoclonal or polyclonal and can be
produced using standard techniques known in the art. Antibodies of the invention can be used
in diagnostic and therapeutic applications.

In a specific embodiment, the subject invention concerns a DNA vaccine (e.g.,
VCL1010/MAP1) containing the major antigenic protein 1 gene (MAP1) driven by the human
cytomegalovirus (HCMV) enhancer-promoter injected intramuscularly into 8-10 week-old
female DBA/2 mice after treating them with 50 µl/muscle of 0.5% bupivacaine 3 days
previously. Up to 75% of the VCL1010/MAP1-immunized mice seroconverted and reacted with
MAP1 in antigen blots. Splenocytes from immunized mice, but not from control mice
immunized with VCL1010 DNA (plasmid vector, Vical, San Diego), proliferated in response to
recombinant MAP1 and C. ruminantium antigens in in vitro lymphocyte proliferation tests.
These proliferating cells from mice immunized with VCL1010/MAP1 DNA secreted IFN-gamma
and IL-2 at concentrations ranging from 610 pg/ml and 152 pg/ml to 1290 pg/ml and 310
pg/ml, respectively. In experiments testing different VCL1010/MAP1 DNA vaccine dose
regimens (25-100 µg/dose, 2 or 4 immunizations), survival rates of 23% to 88% (35/92
survivors/total in all VCL1010/MAP1 immunized groups) were observed on challenge with
30LD50 of C. ruminantium. Survival rates of 0% to 3% (1/144 survivors/total in all control
groups) were recorded for control mice immunized similarly with VCL1010 DNA or saline.
Accordingly, the subject invention concerns the discovery that the gene encoding the MAP1
protein can induce protective immunity as a DNA vaccine against rickettsial disease.

The nucleic acid sequences described herein have other uses as well. For example, the
nucleic acids of the subject invention can be useful as probes to identify complementary 
sequences within other nucleic acid molecules or genomes. Such use of probes can be applied 
to identify or distinguish infectious strains of organisms in diagnostic procedures or in rickettsial 
research where identification of particular organisms or strains is needed. As is well known in 
the art, probes can be made by labeling the nucleic acid sequences of interest according to 
accepted nucleic acid labeling procedures and techniques. A person of ordinary skill in the art 
would recognize that variations or fragments of the disclosed sequences which can specifically 
and selectively hybridize to the DNA of rickettsia can also function as a probe. It is within the 
ordinary skill of persons in the art, and does not require undue experimentation in view of the 
description provided herein, to determine whether a segment of the claimed DNA sequences is 
a fragment or variant which has characteristics of the full sequence, e.g., whether it specifically 
and selectively hybridizes or can confer protection against rickettsial infection in accordance 
with the subject invention. In addition, with the benefit of the subject disclosure describing the 
specific sequences, it is within the ordinary skill of those persons in the art to label hybridizing 
sequences to produce a probe. 

It is also well known in the art that restriction enzymes can be used to obtain functional 
fragments of the subject DNA sequences. For example, Bal31 exonuclease can be conveniently
used for time-controlled limited digestion of DNA (commonly referred to as "erase-a-base"
procedures). See, for example, Maniatis et al. (1982) Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Laboratory, New York; Wei et al. (1983) J. Biol. Chem. 258:13006-13512.

In addition, the nucleic acid sequences of the subject invention can be used as molecular 
weight markers in nucleic acid analysis procedures. 

Following are examples which illustrate procedures for practicing the invention. These 
examples should not be construed as limiting. All percentages are by weight and all solvent 
mixture proportions are by volume unless otherwise noted. 

Example 1 

A nucleic acid vaccine construct was tested in animals for its ability to protect against
death caused by infection with the rickettsia Cowdria ruminantium. The vaccine construct tested
was the MAP1 gene of C. ruminantium inserted into plasmid VCL1010 (Vical, San Diego) under
control of the human cytomegalovirus promoter-enhancer and intron A. In this study, seven
groups containing 10 mice each were injected twice at 2-week intervals with either 100, 75, 50,
or 25 µg VCL1010/MAP1 DNA (V/M in Table 1 below), 100 or 50 µg VCL1010 DNA (V in
Table 1), or saline (Sal.), respectively. Two weeks after the last injections, 8 mice/group were
challenged with 30LD50 of C. ruminantium and clinical symptoms and survival monitored. The
remaining 2 mice/group were not challenged and were used for lymphocyte proliferation tests
and cytokine measurements. The results of the study are summarized in Table 1, below:



Table 1

            100 µg   75 µg   50 µg   25 µg   100 µg   50 µg   Sal.
            V/M      V/M     V/M     V/M     V        V
Survived    5        7       5       3       0        0       0
Died        3        1       3       5       8        8       8


The VCL1010/MAP1 nucleic acid vaccine increased survival on challenge in all groups, with
a total of 20/32 mice surviving compared to 0/24 in the control groups.

This study was repeated with another 6 groups, each containing 33 mice (a total of 198
mice). Three groups received 75 µg VCL1010/MAP1 DNA or VCL1010 DNA or saline (4
injections in all cases). Two weeks after the last injection, 30 mice/group were challenged with
30LD50 of C. ruminantium and 3 mice/group were sacrificed for lymphocyte proliferation tests
and cytokine measurements. The results of this study are summarized in Table 2, below:



Table 2

            V/M 2 inj.   V 2 inj.   Sal. 2 inj.   V/M 4 inj.   V 4 inj.   Sal. 4 inj.
Survived    7            0          0             8            0          1
Died*       23           30         30            22           30         29

*In mice that died in both V/M groups, there was an increase in mean survival time of 
approximately 4 days compared to the controls (p<0.05). 

Again, as summarized in Table 2, the VCL1010/MAP1 DNA vaccine increased the
numbers of mice surviving in both immunized groups, although there was no apparent benefit
of 2 additional injections. In these two experiments, there were a cumulative total of 35/92
(38%) surviving mice in groups receiving the VCL1010/MAP1 DNA vaccine compared to 1/144
(0.7%) surviving mice in the control groups. In both immunization and challenge trials
described above, splenocytes from VCL1010/MAP1 immunized mice, but not from control
mice, specifically proliferated to recombinant MAP1 protein and to C. ruminantium in
lymphocyte proliferation tests. These proliferating splenocytes secreted IL-2 and gamma-interferon
at concentrations up to 310 and 1290 pg/ml, respectively. These data show that
protection against rickettsial infections can be achieved with a DNA vaccine. In addition, these
experiments show MAP1-related proteins as vaccine targets.
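
The cumulative survival figures quoted above can be re-derived from the per-group counts in Tables 1 and 2. The short Python sketch below does that tally and, purely as an illustration, adds a one-sided Fisher exact test on the pooled 2x2 table. The test is an assumption of this sketch and is not an analysis reported in the application; the function names are hypothetical.

```python
from math import comb

# (survived, died) per challenged group, transcribed from Tables 1 and 2.
vaccinated = [(5, 3), (7, 1), (5, 3), (3, 5),   # Table 1, V/M groups
              (7, 23), (8, 22)]                 # Table 2, V/M groups
controls = [(0, 8), (0, 8), (0, 8),             # Table 1, V, V, Sal.
            (0, 30), (0, 30),                   # Table 2, V and Sal., 2 inj.
            (0, 30), (1, 29)]                   # Table 2, V and Sal., 4 inj.


def pooled(groups):
    """Return (total survivors, total challenged) across groups."""
    survived = sum(s for s, _ in groups)
    total = sum(s + d for s, d in groups)
    return survived, total


def fisher_one_sided(a, b, c, d):
    """P(X >= a) for the 2x2 table [[a, b], [c, d]] under the
    hypergeometric null (row and column totals fixed)."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, row1)


v_s, v_n = pooled(vaccinated)
c_s, c_n = pooled(controls)
print(f"vaccinated: {v_s}/{v_n} = {100 * v_s / v_n:.1f}%")
print(f"controls:   {c_s}/{c_n} = {100 * c_s / c_n:.1f}%")
p = fisher_one_sided(v_s, v_n - v_s, c_s, c_n - c_s)
print(f"one-sided Fisher exact p = {p:.2e}")
```

The pooled counts reproduce the 35/92 and 1/144 totals stated in the text. Pooling across dose regimens ignores differences between groups, so the test result should be read only as a rough indication.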

Example 2 

The MAP1 protein of C. ruminantium has significant similarity to MSP4 of A.
marginale, and related molecules may also be present in other rickettsial pathogens. To prove
this, we used primers based on regions conserved between C. ruminantium and A. marginale in
PCR to clone a MAP1-like gene from E. chaffeensis. The amino acid sequence derived from
the cloned E. chaffeensis MAP1-like gene, and alignment with the corresponding genes of C.
ruminantium and A. marginale, are shown in Figure 1. We have now identified the regions of
MAP1-like genes which are highly conserved between Ehrlichia, Cowdria, and Anaplasma and
which can allow cloning of the analogous genes from other rickettsiae.
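
To illustrate what identifying highly conserved regions in an alignment involves, the following Python sketch reports runs of alignment columns in which every sequence carries the same residue. It is a simplified stand-in, not the procedure used to produce Figure 1; the function name and the `min_len` threshold are assumptions, and the example sequences are invented rather than taken from the sequence listing.

```python
def conserved_blocks(aligned, min_len=6):
    """Return (start, end) column ranges (end exclusive) in which all
    aligned sequences carry the same non-gap residue.

    `aligned` is a list of equal-length strings using '-' for gaps.
    min_len is an illustrative threshold, not a value from the application.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")

    blocks, start = [], None
    # Iterate one column past the end so a trailing block is closed.
    for col in range(length + 1):
        same = (
            col < length
            and aligned[0][col] != "-"
            and all(s[col] == aligned[0][col] for s in aligned)
        )
        if same and start is None:
            start = col
        elif not same and start is not None:
            if col - start >= min_len:
                blocks.append((start, col))
            start = None
    return blocks


# Invented toy alignment for demonstration only.
toy = ["MKTAYIAKQR-LLG",
       "MKTAYIAKQRALLG",
       "MRTAYIAKQR-LLG"]
print(conserved_blocks(toy))
```

Blocks found this way indicate where primers spanning several genera might be placed, subject to the usual checks on degeneracy and melting temperature.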

Example 3 - Cloning and sequence analysis of MAP1 homologue genes of E. chaffeensis and
E. canis

Genes homologous to the major surface protein of C. ruminantium MAP1 were cloned
from E. chaffeensis and E. canis by using PCR cloning strategies. The cloned segments
represent a 4.6 kb genomic locus of E. chaffeensis and a 1.6 kb locus of E. canis. DNA sequence
generated from these clones was assembled and is presented along with the deduced amino acid
sequence in Figures 2A-2B (SEQ ID NOs. 7-11 and 14-18) and Figure 2C (SEQ ID NOs. 12-13
and 19-20). Significant features of the DNA include five very similar but nonidentical open
reading frames (ORFs) for E. chaffeensis and two very similar, nonidentical ORFs for the E.
canis cloned locus. The ORFs for both Ehrlichia spp. are separated by noncoding sequences
ranging from 264 to 310 base pairs. The noncoding sequences have a higher A+T content
(71.6% for E. chaffeensis and 76.1% for E. canis) than do the coding sequences (63.5% for E.
chaffeensis and 68.0% for E. canis). A G-rich region ~200 bases upstream from the initiation
codon, sigma-70-like promoter sequences, putative ribosome binding sites (RBS), termination
codons, and palindromic sequences near the termination codons are found in each of the E.
chaffeensis noncoding sequences. The E. canis noncoding sequence has the same features except
for the G-rich region (Figure 2C; SEQ ID NOs. 12-13 and 19-20).
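The A+T percentages quoted above for coding and noncoding regions are simple base-composition ratios. A minimal Python sketch of that calculation follows; the function name `at_content` and the toy input are illustrative assumptions, not sequences or tools from the application.

```python
def at_content(sequence):
    """Return the A+T content of a DNA sequence as a percentage.

    Whitespace and any character other than A, C, G or T are ignored,
    so ambiguous bases do not count toward the total.
    """
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    at_count = sum(1 for b in bases if b in "AT")
    return 100.0 * at_count / len(bases)


# Toy input (hypothetical): 4 of 6 bases are A or T.
print(f"{at_content('AAT TGC'):.1f}% A+T")
```

Applied separately to each ORF and each intergenic segment of a locus, the same function would give the kind of coding versus noncoding comparison reported in the text.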
Sequence comparisons of the ORFs at the nucleotide and translated amino acid levels
revealed a high degree of similarity between them. The similarity spanned the entire coding
sequences, except in three regions where notable sequence variations were observed including
some deletions/insertions (Variable Regions I, II and III). Despite the similarities, no two ORFs
are identical. The cloned ORFs 2, 3 and 4 of E. chaffeensis have complete coding sequences.
The ORF1 is a partial gene having only 143 amino acids at the C-terminus whereas the ORF5
is nearly complete but lacks 5-7 amino acids and a termination codon. The cloned ORF2 of E.
canis also is a partial gene lacking a part of the C-terminal sequence. The overall similarity
between different ORFs at the amino acid level is 56.0% to 85.4% for E. chaffeensis, whereas
for E. canis it is 53.3%. The similarity of E. chaffeensis ORFs to the MAP1 coding sequences
reported for C. ruminantium isolates ranged from 55.5% to 66.7%, while for E. canis to C.
ruminantium it is 48.5% to 54.2%. Due to their high degree of similarity to MAP1 surface
antigen genes of C. ruminantium and since they are nonidentical to each other, the E. chaffeensis
and E. canis ORFs are referred to herein as putative Variable Surface Antigen (VSA) genes. The
apparent molecular masses of the predicted mature proteins of E. chaffeensis were 28.75 kDa
for VSA2, 27.78 for VSA3, and 27.95 for VSA4, while E. canis VSA1 was slightly higher at
29.03 kDa. The first 25 amino acids in each VSA coding sequence were eliminated when
calculating the protein size since they markedly resembled the signal sequence of C.
ruminantium MAP1 and presumably would be absent from the mature protein. Predicted protein
sizes for E. chaffeensis VSA1 and VSA5, and E. canis VSA2 were not calculated since the
complete genes were not cloned.
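The mature-protein sizes above were obtained after discarding the first 25 residues as a presumed signal sequence. The Python sketch below shows one common way to make such an estimate: summing average residue masses and adding one water molecule. The mass table holds standard reference values rather than values taken from the application, the function name is hypothetical, and the application does not state which program or mass table was actually used, so results may differ slightly from the figures in the text.

```python
# Average residue masses in daltons (standard reference values; an
# assumption here, since the source does not give its mass table).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water per polypeptide chain (free N- and C-termini)


def mature_mass_kda(protein, signal_length=25):
    """Estimate the mass in kDa of a protein after signal-peptide removal.

    protein is a one-letter amino acid string; the first signal_length
    residues are dropped before summing average residue masses.
    """
    mature = protein.upper()[signal_length:]
    if not mature:
        raise ValueError("no residues remain after removing the signal peptide")
    return (sum(RESIDUE_MASS[aa] for aa in mature) + WATER) / 1000.0


# Hypothetical toy input: a 25-residue leader followed by Gly-Gly.
print(f"{mature_mass_kda('M' * 25 + 'GG'):.5f} kDa")
```

Passing `signal_length=0` gives the mass of the unprocessed precursor, which is useful for comparing the two forms.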

It should be understood that the examples and embodiments described herein are for 
illustrative purposes only and that various modifications or changes in light thereof will be 
suggested to persons skilled in the art and are to be included within the spirit and purview of this 
application and the scope of the appended claims. 
SEQUENCE LISTING



(1) GENERAL INFORMATION: 



(i) APPLICANT:

Applicant Name(s): University of Florida
Street Address: 223 Grinter Hall
City: Gainesville
State/Province: Florida
Country: US
Postal Code/Zip: 32611
Phone number: (352) 392-8929
Fax: (352) 392-6600



(ii) TITLE OF INVENTION: Nucleic Acid Vaccines Against 
Rickettsial Diseases and Methods of Use 

(iii) NUMBER OF SEQUENCES: 24 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Saliwanchik, Lloyd & Saliwanchik

(B) STREET: 2421 N.W. 41st Street, Suite A-l 

(C) CITY: Gainesville 

(D) STATE: FL 

(E) COUNTRY: USA 

(F) ZIP: 32606 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: PCT 

(B) FILING DATE: 17 October 1997 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Pace, Doran R. 

(B) REGISTRATION NUMBER: 38,261 

(C) REFERENCE/DOCKET NUMBER: UF-167C1 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 352-375-8100 

(B) TELEFAX: 352-372-5800 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 864 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..861

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATG AAT TGC AAG AAA ATT TTT ATC ACA ACT ACA CTA ATA TCA TTA GTG 48
Met Asn Cys Lys Lys Ile Phe Ile Thr Ser Thr Leu Ile Ser Leu Val
1 5 10 15

TCA TTT TTA CCT GGT GTG TCC TTT TCT GAT GTA ATA CAG GAA GAC AGC 96
Ser Phe Leu Pro Gly Val Ser Phe Ser Asp Val Ile Gln Glu Asp Ser
20 25 30

AAC CCA GCA GGC AGT GTT TAC ATT AGC GCA AAA TAC ATG CCA ACT GCA 144
Asn Pro Ala Gly Ser Val Tyr Ile Ser Ala Lys Tyr Met Pro Thr Ala
35 40 45

TCA CAT TTT GGT AAA ATG TCA ATC AAA GAA GAT TCA AAA AAT ACT CAA 192
Ser His Phe Gly Lys Met Ser Ile Lys Glu Asp Ser Lys Asn Thr Gln
50 55 60

ACG GTA TTT GGT CTA AAA AAA GAT TGG GAT GGC GTT AAA ACA CCA TCA 240
Thr Val Phe Gly Leu Lys Lys Asp Trp Asp Gly Val Lys Thr Pro Ser
65 70 75 80

GAT TCT AGC AAT ACT AAT TCT ACA ATT TTT ACT GAA AAA GAC TAT TCT 288
Asp Ser Ser Asn Thr Asn Ser Thr Ile Phe Thr Glu Lys Asp Tyr Ser
85 90 95

TTC AGA TAT GAA AAC AAT CCG TTT TTA GGT TTC GCT GGA GCA ATT GGG 336
Phe Arg Tyr Glu Asn Asn Pro Phe Leu Gly Phe Ala Gly Ala Ile Gly
100 105 110

TAC TCA ATG AAT GGA CCA AGA ATA GAG TTC GAA GTA TCC TAT GAA ACT 384
Tyr Ser Met Asn Gly Pro Arg Ile Glu Phe Glu Val Ser Tyr Glu Thr
115 120 125

TTT GAT GTA AAA AAC CTA GGT GGC AAC TAT AAA AAC AAC GCA CAC ATG 432
Phe Asp Val Lys Asn Leu Gly Gly Asn Tyr Lys Asn Asn Ala His Met
130 135 140

TAC TGT GCT TTA GAT ACA GCA GCA CAA AAT AGC ACT AAT GGC GCA GGA 480
Tyr Cys Ala Leu Asp Thr Ala Ala Gln Asn Ser Thr Asn Gly Ala Gly
145 150 155 160

TTA ACT ACA TCT GTT ATG GTA AAA AAC GAA AAT TTA ACA AAT ATA TCA 528
Leu Thr Thr Ser Val Met Val Lys Asn Glu Asn Leu Thr Asn Ile Ser
165 170 175

TTA ATG TTA AAT GCG TGT TAT GAT ATC ATG CTT GAT GGA ATA CCA GTT 576
Leu Met Leu Asn Ala Cys Tyr Asp Ile Met Leu Asp Gly Ile Pro Val
180 185 190

TCT CCA TAT GTA TGT GCA GGT ATT GGC ACT GAC TTA GTG TCA GTA ATT 624
Ser Pro Tyr Val Cys Ala Gly Ile Gly Thr Asp Leu Val Ser Val Ile
195 200 205

AAT GCT ACA AAT CCT AAA TTA TCT TAT CAA GGA AAG CTA GGC ATA AGT 672
Asn Ala Thr Asn Pro Lys Leu Ser Tyr Gln Gly Lys Leu Gly Ile Ser
210 215 220

TAC TCA ATC AAT TCT GAA GCT TCT ATC TTT ATC GGT GGA CAT TTC CAT 720
Tyr Ser Ile Asn Ser Glu Ala Ser Ile Phe Ile Gly Gly His Phe His
225 230 235 240

AGA GTT ATA GGT AAT GAA TTT AAA GAT ATT GCT ACC TTA AAA ATA TTT 768
Arg Val Ile Gly Asn Glu Phe Lys Asp Ile Ala Thr Leu Lys Ile Phe
245 250 255

ACT TCA AAA ACA GGA ATA TCT AAT CCT GGC TTT GCA TCA GCA ACA CTT 816
Thr Ser Lys Thr Gly Ile Ser Asn Pro Gly Phe Ala Ser Ala Thr Leu
260 265 270

GAT GTT TGT CAC TTT GGT ATA GAA ATT GGA GGA AGG TTT GTA TTT 861
Asp Val Cys His Phe Gly Ile Glu Ile Gly Gly Arg Phe Val Phe
275 280 285

TAA 864



(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 287 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Met Asn Cys Lys Lys Ile Phe Ile Thr Ser Thr Leu Ile Ser Leu Val
1 5 10 15

Ser Phe Leu Pro Gly Val Ser Phe Ser Asp Val Ile Gln Glu Asp Ser
20 25 30

Asn Pro Ala Gly Ser Val Tyr Ile Ser Ala Lys Tyr Met Pro Thr Ala
35 40 45

Ser His Phe Gly Lys Met Ser Ile Lys Glu Asp Ser Lys Asn Thr Gln
50 55 60

Thr Val Phe Gly Leu Lys Lys Asp Trp Asp Gly Val Lys Thr Pro Ser
65 70 75 80

Asp Ser Ser Asn Thr Asn Ser Thr Ile Phe Thr Glu Lys Asp Tyr Ser
85 90 95

Phe Arg Tyr Glu Asn Asn Pro Phe Leu Gly Phe Ala Gly Ala Ile Gly
100 105 110

Tyr Ser Met Asn Gly Pro Arg Ile Glu Phe Glu Val Ser Tyr Glu Thr
115 120 125

Phe Asp Val Lys Asn Leu Gly Gly Asn Tyr Lys Asn Asn Ala His Met
130 135 140

Tyr Cys Ala Leu Asp Thr Ala Ala Gln Asn Ser Thr Asn Gly Ala Gly
145 150 155 160

Leu Thr Thr Ser Val Met Val Lys Asn Glu Asn Leu Thr Asn Ile Ser
165 170 175

Leu Met Leu Asn Ala Cys Tyr Asp Ile Met Leu Asp Gly Ile Pro Val
180 185 190

Ser Pro Tyr Val Cys Ala Gly Ile Gly Thr Asp Leu Val Ser Val Ile
195 200 205

Asn Ala Thr Asn Pro Lys Leu Ser Tyr Gln Gly Lys Leu Gly Ile Ser
210 215 220

Tyr Ser Ile Asn Ser Glu Ala Ser Ile Phe Ile Gly Gly His Phe His
225 230 235 240

Arg Val Ile Gly Asn Glu Phe Lys Asp Ile Ala Thr Leu Lys Ile Phe
245 250 255

Thr Ser Lys Thr Gly Ile Ser Asn Pro Gly Phe Ala Ser Ala Thr Leu
260 265 270

Asp Val Cys His Phe Gly Ile Glu Ile Gly Gly Arg Phe Val Phe
275 280 285



(2) INFORMATION FOR SEQ ID NO: 3:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 842 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE:

(A) NAME/KEY: CDS 

(B) LOCATION: 1..840 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
ATG AAT TAC AAA AAA AGT TTC ATA ACA GCG ATT GAT ATC ATT AAT ATC 48
Met Asn Tyr Lys Lys Ser Phe Ile Thr Ala Ile Asp Ile Ile Asn Ile
290 295 300

CTT CTC TTA CCT GGA GTA TCA TTT TCC GAC CCA AGG CAG GTA GTG GTC 96
Leu Leu Leu Pro Gly Val Ser Phe Ser Asp Pro Arg Gln Val Val Val
305 310 315

ATT AAC GGT AAT TTC TAC ATC AGT GGA AAA TAC GAT GCC AAG GCT TCG 144
Ile Asn Gly Asn Phe Tyr Ile Ser Gly Lys Tyr Asp Ala Lys Ala Ser
320 325 330 335

CAT TTT GGA GTA TTC TCT GCT AAG GAA GAA AGA AAT ACA ACA GTT GGA 192
His Phe Gly Val Phe Ser Ala Lys Glu Glu Arg Asn Thr Thr Val Gly
340 345 350

GTG TTT GGA CTG AAG CAA AAT TGG GAC GGA AGC GCA ATA TCC AAC TCC 240
Val Phe Gly Leu Lys Gln Asn Trp Asp Gly Ser Ala Ile Ser Asn Ser
355 360 365

TCC CCA AAC GAT GTA TTC ACT GTC TCA AAT TAT TCA TTT AAA TAT GAA 288
Ser Pro Asn Asp Val Phe Thr Val Ser Asn Tyr Ser Phe Lys Tyr Glu
370 375 380

AAC AAC CCG TTT TTA GGT TTT GCA GGA GCT ATT GGT TAC TCA ATG GAT 336
Asn Asn Pro Phe Leu Gly Phe Ala Gly Ala Ile Gly Tyr Ser Met Asp
385 390 395

GGT CCA AGA ATA GAG CTT GAA GTA TCT TAT GAA ACA TTT GAT GTA AAA 384
Gly Pro Arg Ile Glu Leu Glu Val Ser Tyr Glu Thr Phe Asp Val Lys
400 405 410 415

AAT CAA GGT AAC AAT TAT AAG AAT GAA GCA CAT AGA TAT TGT GCT CTA 432
Asn Gln Gly Asn Asn Tyr Lys Asn Glu Ala His Arg Tyr Cys Ala Leu
420 425 430

TCC CAT AAC TCA GCA GCA GAC ATG AGT AGT GCA AGT AAT AAT TTT GTC 480
Ser His Asn Ser Ala Ala Asp Met Ser Ser Ala Ser Asn Asn Phe Val
435 440 445

TTT CTA AAA AAT GAA GGA TTA CTT GAC ATA TCA TTT ATG CTG AAC GCA 528
Phe Leu Lys Asn Glu Gly Leu Leu Asp Ile Ser Phe Met Leu Asn Ala
450 455 460

TGC TAT GAC GTA GTA GGC GAA GGC ATA CCT TTT TCT CCT TAT ATA TGC 576
Cys Tyr Asp Val Val Gly Glu Gly Ile Pro Phe Ser Pro Tyr Ile Cys
465 470 475

GCA GGT ATC GGT ACT GAT TTA GTA TCC ATG TTT GAA GCT ACA AAT CCT 624
Ala Gly Ile Gly Thr Asp Leu Val Ser Met Phe Glu Ala Thr Asn Pro
480 485 490 495

AAA ATT TCT TAC CAA GGA AAG TTA GGT TTA AGC TAC TCT ATA AGC CCA 672
Lys Ile Ser Tyr Gln Gly Lys Leu Gly Leu Ser Tyr Ser Ile Ser Pro
500 505 510

GAA GCT TCT GTG TTT ATT GGT GGG CAC TTT CAT AAG GTA ATA GGG AAC 720
Glu Ala Ser Val Phe Ile Gly Gly His Phe His Lys Val Ile Gly Asn
515 520 525

GAA TTT AGA GAT ATT CCT ACT ATA ATA CCT ACT GGA TCA ACA CTT GCA 768
Glu Phe Arg Asp Ile Pro Thr Ile Ile Pro Thr Gly Ser Thr Leu Ala
530 535 540

GGA AAA GGA AAC TAC CCT GCA ATA GTA ATA CTG GAT GTA TGC CAC TTT 816
Gly Lys Gly Asn Tyr Pro Ala Ile Val Ile Leu Asp Val Cys His Phe
545 550 555

GGA ATA GAA ATG GGA GGA AGG TTT AA 842
Gly Ile Glu Met Gly Gly Arg Phe
560 565



(2) INFORMATION FOR SEQ ID NO:4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 280 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 

Met Asn Tyr Lys Lys Ser Phe Ile Thr Ala Ile Asp Ile Ile Asn Ile
1 5 10 15

Leu Leu Leu Pro Gly Val Ser Phe Ser Asp Pro Arg Gln Val Val Val
20 25 30

Ile Asn Gly Asn Phe Tyr Ile Ser Gly Lys Tyr Asp Ala Lys Ala Ser
35 40 45

His Phe Gly Val Phe Ser Ala Lys Glu Glu Arg Asn Thr Thr Val Gly
50 55 60

Val Phe Gly Leu Lys Gln Asn Trp Asp Gly Ser Ala Ile Ser Asn Ser
65 70 75 80

Ser Pro Asn Asp Val Phe Thr Val Ser Asn Tyr Ser Phe Lys Tyr Glu
85 90 95

Asn Asn Pro Phe Leu Gly Phe Ala Gly Ala Ile Gly Tyr Ser Met Asp
100 105 110

Gly Pro Arg Ile Glu Leu Glu Val Ser Tyr Glu Thr Phe Asp Val Lys
115 120 125

Asn Gln Gly Asn Asn Tyr Lys Asn Glu Ala His Arg Tyr Cys Ala Leu
130 135 140

Ser His Asn Ser Ala Ala Asp Met Ser Ser Ala Ser Asn Asn Phe Val
145 150 155 160

Phe Leu Lys Asn Glu Gly Leu Leu Asp Ile Ser Phe Met Leu Asn Ala
165 170 175

Cys Tyr Asp Val Val Gly Glu Gly Ile Pro Phe Ser Pro Tyr Ile Cys
180 185 190

Ala Gly Ile Gly Thr Asp Leu Val Ser Met Phe Glu Ala Thr Asn Pro
195 200 205

Lys Ile Ser Tyr Gln Gly Lys Leu Gly Leu Ser Tyr Ser Ile Ser Pro
210 215 220

Glu Ala Ser Val Phe Ile Gly Gly His Phe His Lys Val Ile Gly Asn
225 230 235 240

Glu Phe Arg Asp Ile Pro Thr Ile Ile Pro Thr Gly Ser Thr Leu Ala
245 250 255

Gly Lys Gly Asn Tyr Pro Ala Ile Val Ile Leu Asp Val Cys His Phe
260 265 270

Gly Ile Glu Met Gly Gly Arg Phe
275 280



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 849 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..846

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

ATG AAT TAC AGA GAA TTG TTT ACA GGG GGC CTG TCA GCA GCC ACA GTC 48
Met Asn Tyr Arg Glu Leu Phe Thr Gly Gly Leu Ser Ala Ala Thr Val
285 290 295

TGC GCC TGC TCC CTA CTT GTT AGT GGG GCC GTA GTG GCA TCT CCC ATG 96
Cys Ala Cys Ser Leu Leu Val Ser Gly Ala Val Val Ala Ser Pro Met
300 305 310

AGT CAC GAA GTG GCT TCT GAA GGG GGA GTA ATG GGA GGT AGC TTT TAC 144
Ser His Glu Val Ala Ser Glu Gly Gly Val Met Gly Gly Ser Phe Tyr
315 320 325

GTG GGT GCG GCC TAC AGC CCA GCA TTT CCT TCT GTT ACC TCG TTC GAC 192
Val Gly Ala Ala Tyr Ser Pro Ala Phe Pro Ser Val Thr Ser Phe Asp
330 335 340

ATG CGT GAG TCA AGC AAA GAG ACC TCA TAC GTT AGA GGC TAT GAC AAG 240
Met Arg Glu Ser Ser Lys Glu Thr Ser Tyr Val Arg Gly Tyr Asp Lys
345 350 355 360

AGC ATT GCA ACG ATT GAT GTG AGT GTG CCA GCA AAC TTT TCC AAA TCT 288
Ser Ile Ala Thr Ile Asp Val Ser Val Pro Ala Asn Phe Ser Lys Ser
365 370 375

GGC TAC ACT TTT GCC TTC TCT AAA AAC TTA ATC ACG TCT TTC GAC GGC 336
Gly Tyr Thr Phe Ala Phe Ser Lys Asn Leu Ile Thr Ser Phe Asp Gly
380 385 390

GCT GTG GGA TAT TCT CTG GGA GGA GCC AGA GTG GAA TTG GAA GCG AGC 384
Ala Val Gly Tyr Ser Leu Gly Gly Ala Arg Val Glu Leu Glu Ala Ser
395 400 405

TAC AGA AGG TTT GCT ACT TTG GCG GAC GGG CAG TAC GCA AAA AGT GGT 432
Tyr Arg Arg Phe Ala Thr Leu Ala Asp Gly Gln Tyr Ala Lys Ser Gly
410 415 420

GCG GAA TCT CTG GCA GCT ATT ACC CGC GAC GCT AAC ATT ACT GAG ACC 480
Ala Glu Ser Leu Ala Ala Ile Thr Arg Asp Ala Asn Ile Thr Glu Thr
425 430 435 440

AAT TAC TTC GTA GTC AAA ATT GAT GAA ATC ACA AAC ACC TCA GTC ATG 528
Asn Tyr Phe Val Val Lys Ile Asp Glu Ile Thr Asn Thr Ser Val Met
445 450 455

TTA AAT GGC TGC TAT GAC GTG CTG CAC ACA GAT TTA CCT GTG TCC CCG 576
Leu Asn Gly Cys Tyr Asp Val Leu His Thr Asp Leu Pro Val Ser Pro
460 465 470

TAT GTA TGT GCC GGG ATA GGC GCA AGC TTT GTT GAC ATC TCT AAG CAA 624
Tyr Val Cys Ala Gly Ile Gly Ala Ser Phe Val Asp Ile Ser Lys Gln
475 480 485

GTA ACC ACA AAG CTG GCC TAC AGG GGC AAG GTT GGG ATT AGC TAC CAG 672
Val Thr Thr Lys Leu Ala Tyr Arg Gly Lys Val Gly Ile Ser Tyr Gln
490 495 500

TTT ACT CCG GAA ATA TCC TTG GTG GCA GGT GGG TTC TAC CAC GGG CTA 720
Phe Thr Pro Glu Ile Ser Leu Val Ala Gly Gly Phe Tyr His Gly Leu
505 510 515 520

TTT GAT GAG TCT TAC AAG GAC ATT CCC GCA CAC AAC AGT GTA AAG TTC 768
Phe Asp Glu Ser Tyr Lys Asp Ile Pro Ala His Asn Ser Val Lys Phe
525 530 535

TCT GGA GAA GCA AAA GCC TCA GTC AAA GCG CAT ATT GCT GAC TAC GGC 816
Ser Gly Glu Ala Lys Ala Ser Val Lys Ala His Ile Ala Asp Tyr Gly
540 545 550

TTT AAC CTT GGA GCA AGA TTC CTG TTC AGC TAA 849
Phe Asn Leu Gly Ala Arg Phe Leu Phe Ser
555 560



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 282 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Asn Tyr Arg Glu Leu Phe Thr Gly Gly Leu Ser Ala Ala Thr Val
1 5 10 15

Cys Ala Cys Ser Leu Leu Val Ser Gly Ala Val Val Ala Ser Pro Met
20 25 30

Ser His Glu Val Ala Ser Glu Gly Gly Val Met Gly Gly Ser Phe Tyr
35 40 45

Val Gly Ala Ala Tyr Ser Pro Ala Phe Pro Ser Val Thr Ser Phe Asp
50 55 60

Met Arg Glu Ser Ser Lys Glu Thr Ser Tyr Val Arg Gly Tyr Asp Lys
65 70 75 80

Ser Ile Ala Thr Ile Asp Val Ser Val Pro Ala Asn Phe Ser Lys Ser
85 90 95

Gly Tyr Thr Phe Ala Phe Ser Lys Asn Leu Ile Thr Ser Phe Asp Gly
100 105 110

Ala Val Gly Tyr Ser Leu Gly Gly Ala Arg Val Glu Leu Glu Ala Ser
115 120 125

Tyr Arg Arg Phe Ala Thr Leu Ala Asp Gly Gln Tyr Ala Lys Ser Gly
130 135 140

Ala Glu Ser Leu Ala Ala Ile Thr Arg Asp Ala Asn Ile Thr Glu Thr
145 150 155 160

Asn Tyr Phe Val Val Lys Ile Asp Glu Ile Thr Asn Thr Ser Val Met
165 170 175

Leu Asn Gly Cys Tyr Asp Val Leu His Thr Asp Leu Pro Val Ser Pro
180 185 190

Tyr Val Cys Ala Gly Ile Gly Ala Ser Phe Val Asp Ile Ser Lys Gln
195 200 205

Val Thr Thr Lys Leu Ala Tyr Arg Gly Lys Val Gly Ile Ser Tyr Gln
210 215 220

Phe Thr Pro Glu Ile Ser Leu Val Ala Gly Gly Phe Tyr His Gly Leu
225 230 235 240

Phe Asp Glu Ser Tyr Lys Asp Ile Pro Ala His Asn Ser Val Lys Phe
245 250 255

Ser Gly Glu Ala Lys Ala Ser Val Lys Ala His Ile Ala Asp Tyr Gly
260 265 270

Phe Asn Leu Gly Ala Arg Phe Leu Phe Ser
275 280
Claims

1. A composition comprising a polynucleotide which encodes a polypeptide having the
characteristic of eliciting an immune response protective against disease or death caused by a
rickettsial pathogen.

2. The composition, according to claim 1, wherein said rickettsial pathogen is selected
from the group consisting of Rickettsia spp., Ehrlichia spp., Anaplasma spp., and Cowdria spp.

3. The composition, according to claim 1, wherein said polypeptide has an amino acid
sequence selected from the group consisting of SEQ ID NO. 2, SEQ ID NO. 4, SEQ ID NO. 6,
SEQ ID NO. 14, SEQ ID NO. 15, SEQ ID NOS. 16-20, SEQ ID NO. 23, and SEQ ID NO. 24,
or a fragment thereof.

4. The composition, according to claim 1, wherein said polynucleotide has a nucleic
acid sequence selected from the group consisting of SEQ ID NO. 1, SEQ ID NO. 3, SEQ ID NO.
5, SEQ ID NO. 7, SEQ ID NO. 8, SEQ ID NOS. 9-13, SEQ ID NO. 21, and SEQ ID NO. 22,
or a fragment thereof.

5. The composition, according to claim 4, wherein said polynucleotide has a nucleic
acid sequence of SEQ ID NO. 3, or a fragment thereof.

6. The composition, according to claim 1, wherein said polynucleotide further
comprises a nucleic acid vaccine vector.

7. The composition, according to claim 1, further comprising a pharmaceutically
acceptable carrier.

8. A polynucleotide encoding a polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NO. 4, SEQ ID NOS. 14-20, SEQ ID NOS. 23-24, and
fragments thereof.

9. The polynucleotide, according to claim 8, said polynucleotide having a nucleic acid
sequence selected from the group consisting of SEQ ID NO. 3, SEQ ID NOS. 7-13, and SEQ
ID NOS. 21-22.

10. A method for protecting a susceptible animal host against disease or death caused 
by a rickettsial pathogen, said method comprising administering an effective amount of a 
polynucleotide encoding polypeptide having the characteristic of eliciting an immune response 
protective against said rickettsial pathogen. 

11. The method, according to claim 10, wherein said rickettsial pathogen is selected
from the group consisting of Rickettsia spp., Ehrlichia spp., Anaplasma spp., and Cowdria spp.

12. The method, according to claim 10, wherein said polypeptide has an amino acid
sequence selected from the group consisting of SEQ ID NO. 2, SEQ ID NO. 4, SEQ ID NO. 6,
SEQ ID NO. 14, SEQ ID NO. 15, SEQ ID NOS. 16-20, SEQ ID NO. 23, and SEQ ID NO. 24,
or a fragment thereof.

13. The method, according to claim 10, wherein said polynucleotide has a nucleic acid
sequence selected from the group consisting of SEQ ID NO. 1, SEQ ID NO. 3, SEQ ID NO. 5,
SEQ ID NO. 7, SEQ ID NO. 8, SEQ ID NOS. 9-13, SEQ ID NO. 21, and SEQ ID NO. 22.

14. The method, according to claim 13, wherein said polynucleotide has the nucleic acid
sequence of SEQ ID NO. 1.

15. The method, according to claim 13, wherein said polynucleotide has the nucleic acid 
sequence of SEQ ID NO. 3. 

16. The method, according to claim 13, wherein said polynucleotide has the nucleic acid 
sequence of SEQ ID NO. 5. 



17. The method, according to claim 10, wherein said nucleic acid further comprises 
appropriate nucleic acid vector. 
18. The method, according to claim 10, wherein said composition further comprises a
pharmaceutically acceptable carrier.

19. A method for detecting, in a human or animal, antibodies associated with infection
by Ehrlichia, wherein said method comprises contacting a biological fluid from said human or
animal with a polypeptide selected from the group consisting of SEQ ID NO. 4, SEQ ID NOS.
14-20, SEQ ID NOS. 23-24, and fragments thereof.

[Figures 2A-2B: nucleotide sequence of the cloned E. chaffeensis locus with the deduced amino acid sequences of the ORFs shown beneath it; the -35 and -10 promoter elements and ribosome binding sites (RBS) are marked. The sequence text of the figure is not recoverable here; see SEQ ID NOs. 7-11 and 14-18.]

[Figure 2C: nucleotide sequence of the cloned E. canis locus with the deduced amino acid sequences of the ORFs shown beneath it; the -35 and -10 promoter elements and ribosome binding sites (RBS) are marked. The sequence text of the figure is not recoverable here; see SEQ ID NOs. 12-13 and 19-20.]

1 acatgtatacattatagtaacaaatgttaccgtattttattcataagttaagtaaaatct

61 ataccattctctttcactttatcagaagacttttatttatcacaaactcatgacgtatag 

121 tgtcacaaataaacacactgcaactgcaatcactacgtaaaactttaactcttctttttc 

181 acaactaaaatactaataaaagtaatatagtataaaaaatcttaagtaacTIGacataat 

attactctgatalMsaiatgtctagtatctctatactaaacgtttatataatt^^ca 

tattaATGAAAGCTATI^AATTCMACTTAATCTCTGCTTACTJCTTTK 

MKAIKFILNVCLLF A-* A I F L 

TA^GTATTCC^ATTACAAAACA^^ 

GYSYITKQGIFQTKHHDTPN 

ATACTACTATACCAAAT<^GACGGTATTCAATCTAGCTO 